Text File | 1990-06-21 | 31KB | 736 lines

CHAPTER 8 - SHIFT AND ROTATE

There are seven instructions that move the individual bits of a byte or word either left or right. Each instruction works slightly differently. We'll make a standard program and then substitute each instruction into that program.

SAL - SHL

The instructions SHL (shift logical left) and SAL (shift arithmetic left) are exactly the same. They have the same machine code. They shift each bit to the left. How far? That depends. There are two (and only two) forms of this instruction. All other shift and rotate instructions have these two (and only these two) forms as well. The first form is:

     shl  al, 1

which shifts each bit to the left one bit. The number MUST be 1. No other number is possible. The other form is:

     shl  al, cl

which shifts the bits in AL to the left by the number in CL. If CL = 3, it shifts left by 3. If CL = 7, it shifts left by 7. The count register MUST be CL (not CX). The bits on the left are shifted out of the register into the bit bucket, and zeros are inserted on the right.

The easy way to understand this is to fire up the standard program. Remember, from now on we always use template.asm.
;sal.asm
; + + + + + + + + + + + + + + + START CODE BELOW THIS LINE
     mov  ax_byte, 0A3h        ; half reg, low reg binary
     mov  bx_byte, 0A4h        ; half reg, low reg hex
     mov  cx_byte, 0A1h        ; half reg, low reg signed
     mov  dx_byte, 0A2h        ; half reg, low reg unsigned
     lea  ax, ax_byte
     call set_reg_style
     mov  ax, 0                ; clear registers
     mov  bx, 0
     mov  cx, 0
     mov  dx, 0
     mov  di, 0
     mov  bp, 0
     call show_regs

outer_loop:
     call get_hex_byte         ; get number and put in registers
     mov  bl, al
     mov  cl, al
     mov  dl, al
     mov  si, 8                ; 8 iterations of the loop
     and  al, al               ; set the flags
     call show_regs_and_wait

shift_loop:
     sal  al, 1
     sal  bl, 1
     sal  cl, 1
     sal  dl, 1
     call show_regs_and_wait
     dec  si
     jnz  shift_loop
     jmp  outer_loop
; + + + + + + + + + + + + + + + END CODE ABOVE THIS LINE

______________________
The PC Assembler Tutor - Copyright (C) 1989 Chuck Nelson

This standard program works with bytes, not words. If we had used words we would have performed 16 individual shifts, and that would have been time consuming and boring.

First we set the style to half registers. Notice that one is binary, one is hex, one is signed and one is unsigned. That covers all bases. All the registers are then cleared.

It would be nice to use the loop instruction, but CX is committed, so we make our own loop instruction. We move 8 into SI. The loop instructions are:

     dec  si
     jnz  shift_loop

DEC decrements a register or a variable by 1. Its counterpart INC increments a register or variable by 1. JNZ (jump if not zero) jumps to 'shift_loop' if SI is not zero.

We get a hex byte in AL and put the same byte in BL, CL, and DL. This way we will be able to see what is happening in binary, hex, signed and unsigned. Before starting, we have:

     and  al, al

This is there to set the flags correctly before starting. All four are shifted left one bit each time, and then we look at the result.

Assemble, link and run it. Enter the number 7. In binary, that is (0000 0111).
Take a look at the flags before starting. It is a positive number so SF shows '+'. ZF is not set. PF shows 'O'. O stands for odd. Every time you perform an arithmetic or logical operation, the 8086 checks parity. Parity is whether the number contains an even or odd number of 1 bits. This contains three 1 bits, so the parity is odd. The possible settings are 'E' for even and 'O' for odd.{1} SAL checks for parity (though some of the other instructions don't).

____________________
1 This is for use by communications programs.

Now press ENTER. It will shift left 1 and you will have (0000 1110). What does the unsigned number say now? 14. Press ENTER again. (0001 1100) What does the unsigned number say? 28. Again (0011 1000) 56. Again (0111 0000) 112. Notice that the signed number reads +112. Look at CF and OF. They are both cleared.

Things are going to change now. Press ENTER again. (1110 0000). SF is now '-'. OF, the overflow flag, is set because you changed the number from positive to negative (from +112 to -32). What is the unsigned number now? 224. CF is cleared. PF is 'O' because there are still three 1 bits.

Shift again. (1100 0000) OF is cleared because you didn't change signs. (Remember, the leftmost bit is the sign bit for a signed number.) PF is now 'E' because you have two 1 bits, and two is even. CF is set because you shifted a 1 bit off the left end. Keep pressing ENTER and watch SF, OF, CF, and PF.

Let's look at the unsigned numbers we had until we started shifting 1 bits off the left end. We started with 7, then had 14, 28, 56, 112, 224. This instruction is multiplying by 2. That's right, and it is MUCH faster than multiplication (about 50 times faster). Far and away the fastest way to multiply a register by 2, 4 or 8 is to use sal.

     ; by 2          ; by 4          ; by 8
     sal  di, 1      sal  di, 1      sal  di, 1
                     sal  di, 1      sal  di, 1
                                     sal  di, 1

For a register, it is faster to use a series of 1 shifts than to load cl.
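The same series of 1-bit shifts also handles multipliers that are not powers of 2. What follows is only a sketch, assuming the fragment sits between the START CODE and END CODE lines of template.asm. The sample value 7 and the use of BX as a scratch register are assumptions made for illustration. It relies on 10 * x = 8 * x + 2 * x.

```asm
; Sketch only: multiply the unsigned word in AX by 10 with shifts.
     mov  ax, 7                ; sample value (an assumption)
     shl  ax, 1                ; ax = 2 * x  = 14
     mov  bx, ax               ; keep 2 * x in bx
     shl  ax, 1                ; ax = 4 * x  = 28
     shl  ax, 1                ; ax = 8 * x  = 56
     add  ax, bx               ; ax = 8 * x + 2 * x = 70
     call show_regs_and_wait   ; AX should now show 70 (46h)
```

This version does no overflow checking. If the starting value is larger than 6553, the result no longer fits in 16 bits.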
For a variable in memory, anything over 1 shift is faster if you load cl.

Do a few more numbers to see what is happening both with the number and the flags. CF always signals when a 1 bit has been shifted off the end.

SAR and SHR

Unlike the left shift instruction, there are two completely different right shift instructions. SHR (shift logical right) shifts the bits to the right, setting CF if a 1 bit is pushed off the right end. It puts 0s in the leftmost bit. Make a copy of SAL.ASM and replace the four instructions:

     sal  al, 1
     sal  bl, 1
     sal  cl, 1
     sal  dl, 1

with SHR. We'll call the new program SHR.ASM. Run this one too. Instead of 7, use E0h (1110 0000) which is 224d. The first time you shift (0111 0000) the OF flag will be set because the sign changed. Keep shifting, noting the flags and the unsigned number. This time we have 224, 112, 56, 28, 14, 7, 3, 1. It is dividing by two and is once again MUCH faster than division. For a single shift, the remainder is in CF. For a shift of more than one bit, you lose the remainder, but there is a way around this which we will discuss in a moment. Do some more numbers till you are comfortable with the flags and the operation.

If you want to divide by 16, you will shift right four times, so you'll lose those 4 bits. But those bits are exactly the value of the remainder. All we need to do is:

     mov  dx, ax                    ; copy of number to dx
     and  dx, 0000000000001111b     ; remainder in dx
     mov  cl, 4                     ; shift right 4 bits
     shr  ax, cl                    ; quotient in ax

Using a mask, we keep only the right four bits, which is the remainder.

SAR

SAR (shift arithmetic right) is different. It shifts right like SHR, but the leftmost bit always stays the same. This will make more sense when you run the program. Make another copy, call it SAR.ASM, and change the four instructions to SAR. The flags operate the same as for SHR and SHL. The overflow flag will never change since the left bit will always stay the same.
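Before running the program, the difference between the two right shifts can be previewed with one byte in two registers. This is only a sketch, assuming the fragment sits between the START CODE and END CODE lines of template.asm. The sample value E0h is the one used for SHR.ASM above.

```asm
; Sketch only: the same byte shifted right once by SHR and once by SAR.
     mov  al, 0E0h             ; 1110 0000 = 224 unsigned, -32 signed
     mov  bl, al               ; same byte in bl
     shr  al, 1                ; al = 0111 0000 = 70h : 0 enters on the left
     sar  bl, 1                ; bl = 1111 0000 = F0h : sign bit is repeated
     call show_regs_and_wait   ; compare AL (112) with BL (-16 signed)
```

SHR halves the unsigned value (224 to 112), while SAR halves the signed value (-32 to -16).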
First enter 74h (+116). We will be looking at the signed numbers only. Copy down the signed numbers as you go along. They should be: 116, 58, 29, 14, 7, 3, 1, 0, 0. Now try 8Ch (-116). The numbers you should get are: -116, -58, -29, -15, -8, -4, -2, -1, -1. They started out the same, then they got off by one. The negative numbers are one too negative. Try 39h (+57). The numbers here are: 57, 28, 14, 7, 3, 1, 0, 0, 0. Just as it should be for division by 2. Now try C7h (-57). Here the numbers are: -57, -29, -15, -8, -4, -2, -1, -1, -1. This time it went screwy right off the bat. Once again, the negative numbers are one too negative.

SAR is an instruction for doing signed division by 2 (sort of). It is, however, an incomplete instruction. The rule for SAR is:

     SAR gives the correct answer if the number is positive. It gives the correct answer if the number is negative and the remainder is zero. If the number is negative but there is a remainder, then the answer is one too negative.

The reason for this is a little complex, but we need to add some code if we want to do signed division.{2} For SHR, the remainder part was optional. Here it is not. We need to know whether the remainder is zero or not. For this example we will do a word shift right by 6. That's dividing by 64.

____________________
2 Both the code and the reasons will be explained (but not proved) in the summary.

     remainder_mask   dw  003Fh     ; 63

     call get_signed                ; number in ax
     mov  bx, ax                    ; copy in bx
     and  bx, remainder_mask        ; the remainder
     mov  cl, 6                     ; shift right 6 bits
     sar  ax, cl
     jns  continue                  ; is it positive?
     and  bx, bx                    ; is the remainder zero?
     jz   continue
     inc  ax
continue:

We get the remainder, then shift right 6 bits. Upon finishing SAR, the sign flag will be set correctly. Here is yet another jump. This one is JNS (jump on not sign). It jumps if the sign flag is NOT set, that is, if the number is positive. If it is positive, then everything is ok so we skip ahead.
If the number is negative, then we check to see if there was a remainder. If there wasn't, everything is ok, so we go ahead. If there was a remainder, then we INC (add 1) ax.

Is the remainder correct? If the number was positive, the remainder is correct, but if the number was negative, then we need to do one more thing. After INC, but before 'continue' we have a SUB instruction:

     inc  ax
     sub  bx, 64                    ; correct the remainder
continue:

Why that is the correct number will be explained in the summary.

What a lot of work when we could simply write:

     mov  cx, 64
     call get_signed
     cwd                            ; sign extend
     idiv cx                        ; signed division

Is there any advantage to this instruction? Not really. Remember that the more you shift, the longer it takes. If you shift 2, then it's about 1/3 faster than division. If you shift 14, then it is only 15% faster than division. Considering that even a slow PC can do 25000 divisions a second, you must be in serious need of speed to use this. In any case, you will never or almost never use SAR for signed division, while you will find lots of opportunity to use SHR and SHL for unsigned multiplication and division.

ROR and ROL

ROR (rotate right) and ROL (rotate left) rotate the bits around the register. We will just do one program since they operate the same way, only in opposite directions. Make another copy of SAL.ASM and put in ROR in the appropriate spots. Enter a number. This time you will notice that the bits, rather than disappearing off the end, reappear on the other side. They rotate around the register. The only flags that are defined are OF and CF. OF is set if the high bit changes, and CF is set if a 1 bit moves off the end of the register to the other side. Do a few more, and we'll go on to the last two instructions.

RCR and RCL

RCR (rotate through carry right) and RCL (rotate through carry left) rotate the same as the above instructions except that the carry flag is involved.
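As a preview of what the carry flag changes, the sketch below rotates the same byte once with ROR and once with RCR. It assumes the fragment sits between the START CODE and END CODE lines of template.asm. The sample value 81h is an assumption, and CLC (clear carry flag) is a standard 8086 instruction that this chapter does not otherwise use.

```asm
; Sketch only: ROR versus RCR on the same byte, starting with CF = 0.
     mov  al, 81h              ; 1000 0001
     mov  bl, al               ; same byte in bl
     clc                       ; CF = 0
     ror  al, 1                ; al = 1100 0000 = C0h : low bit goes to the
                               ;   high bit (and is copied to CF)
     clc                       ; CF = 0 again
     rcr  bl, 1                ; bl = 0100 0000 = 40h : old CF (0) goes to
                               ;   the high bit, low bit (1) goes to CF
     call show_regs_and_wait   ; compare AL and BL, CF is set
```

With ROR the 1 bit comes straight back in on the left. With RCR it waits in CF and only comes back in on the next rotate.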
Rotating right, the low bit moves to CF, the carry flag, and CF moves to the high bit. Rotating left, the high bit moves to CF and CF moves to the low bit. There are 9 bits (or 17 bits for a word) involved in the rotation. Make yet another copy of the program, and change those 4 instructions to RCR. Also, since we have 9 bits instead of 8, change the loop count to 9 from 8:

     mov  si, 9

Enter a number and watch it move. Before you start moving, look at CF and see if there is anything in it. There are only two flags defined, OF and CF. Obviously, CF is set if there is something in it. OF is weird. In RCL (the opposite instruction to the one we are using), OF operates normally, signalling a change in the top (sign) bit. In RCR, OF signals a change in CF. Why? I don't have the slightest idea. You really have no need for the OF flag anyway, so this is unimportant.

Well, those are the seven instructions, but what can you do with them besides multiply and divide?

First, you can work with multiple bit data. The 8087 has a word length register called the status register. Looking at the upper byte:

     15   14   13   12   11   10    9    8
                X    X    X

bits 11, 12 and 13 contain a number from 0 to 7. The data in this register is not directly accessible. You need to move the register into memory, then into an 8086 register. If you want to find what this number is, what do you do?

     mov  bx, status_register_data
     mov  cl, 3
     ror  bx, cl
     and  bh, 00000111b

We rotate right 3 and then mask off everything else. The number is now in BH. We could have used SHR if we wanted. Another 8087 register is the control register. In the upper byte it has:

     15   14   13   12   11   10    9    8
                          X    X

a number from 0 to 3 in bits 10 and 11. If we want the information, we do the same thing:

     mov  bx, control_register_data
     mov  cl, 2
     ror  bx, cl
     and  bh, 00000011b

and the number is in BH.

You are now going to write a program that inputs an unsigned number and prints out its hex representation.
Here it is:

; + + + + + + + + + + + + + + + START CODE BELOW THIS LINE
     mov  ax_byte, 0A5h        ; half regs, right ascii
     mov  bx_byte, 4           ; hex
     mov  dx_byte, 4           ; hex
     lea  ax, ax_byte
     call set_reg_style
     call show_regs

outer_loop:
     call get_unsigned
     mov  bx, ax
     mov  dx, ax
     mov  cx, 4

inner_loop:
     push cx                   ; save cx
     mov  cl, 4
     rol  bx, cl               ; rotate left 1/2 byte
     mov  al, bl               ; copy to al
     and  al, 0Fh              ; mask off upper 1/2 byte
     cmp  al, 10               ; < 10,  0 - 9
                               ; > 9    A - F
     jae  use_letters
     add  al, '0'              ; change to ascii
     jmp  print_it

use_letters:
     add  al, 'A' - 10         ; 10 = 'A'

print_it:
     call print_ascii_byte
     call show_regs_and_wait
     pop  cx
     loop inner_loop
     jmp  outer_loop
; + + + + + + + + + + + + + + + END CODE ABOVE THIS LINE

AL will be shown in ascii while BX and DX will be in hex. We save the original number in DX. Since the first thing we want to print is the left hex character, we rotate left, not right. We move the low byte to AL, mask off everything but the low hex number and then convert to an ascii character. If it is 0 - 9, we add '0' (the character, not the number). If it is > 9, we add "'A' - 10" and get a letter (if the number is 10, we get 'A'). JAE means jump if above or equal, and is an unsigned comparison.{3} Finally, we print the ascii character that is in AL.{4}

____________________
3 You are getting inundated with conditional jump instructions. Don't worry. As long as you understand each one when you run across it, you don't have to remember it. All jump instructions will be covered soon.

Another thing to notice is that just inside the loop we push CX. That is because we use CL for the ROL instruction. It is then POPped just before the loop instruction. This is typical. CX is the only register that can be used for counting in indexed instructions. It is common for indexing instructions to be nested, so you temporarily store the old value of CX while you are using CX for something different.
     push cx                   ; typical code for a shift
     mov  cl, 7
     shr  si, cl
     pop  cx

Finally, let's multiply large numbers by 2. Here's the code:

; + + + + + + + + + + + + + + + START DATA BELOW THIS LINE
byte1          db  ?
byte2          db  ?
byte3          db  ?
byte4          db  ?
error_message  db  "Result is too large.", 0
; + + + + + + + + + + + + + + + END DATA ABOVE THIS LINE

; + + + + + + + + + + + + + + + START CODE BELOW THIS LINE
outer_loop:
     lea  ax, byte1            ; get 4 byte number
     call get_unsigned_4byte
     shl  byte1, 1
     rcl  byte2, 1
     rcl  byte3, 1
     rcl  byte4, 1
     jnc  go_on
     lea  ax, error_message
     call print_string

go_on:
     lea  ax, byte1
     call print_unsigned_4byte
     jmp  outer_loop
; + + + + + + + + + + + + + + + END CODE ABOVE THIS LINE

This will require some explanation. Get_unsigned_4byte gets a number from 1 to four billion. We put it in memory. Normally, the following instructions would be done word by word. We are doing them byte by byte so you can see the mechanics of the situation.

The low byte is shifted left 1 bit. This doubles it, but may shift a 1 bit from the high bit into CF. If it does, then it will be present when we rotate byte2. That moves CF into the low bit and moves the high bit into CF. We do it again. And again. If there is an unsigned overflow, it will be signalled by CF being set after:

     rcl  byte4, 1

JNC (jump on not carry) will skip the error message if everything is ok. Print_string prints a zero terminated string, that is, a C string which is terminated by the number (not the character) 0. Finally, we print the number.

____________________
4 Any subroutine in ASMHELP.OBJ that involves a one byte input or output has the data in AL.

A word about large numbers in ASMHELP.OBJ. It is assumed that you would like to use commas if you could. Any data type over 1 word long allows commas.
The following are considered the same by ASMHELP.OBJ in its input routines:

     23546787
     2,3,5,4,6,7,8,7
     23,,5,46,,78,7
     23,546787
     23,546,787

It always prints commas correctly in the print routines.


SUMMARY

All shift and rotate instructions operate on either a register or on memory. They can be either 1 bit shifts:

     sal  cx, 1
     ror  variable1, 1
     shr  bl, 1

or shifts indexed by CL (it must be CL):

     rcl  variable2, cl
     sar  si, cl
     rol  ah, cl

SHL and SAL

SHL (shift logical left) and SAL (shift arithmetic left) are exactly the same instruction. They move bits left. 0s are placed in the low bit. Bits are shoved off the register (or memory data) on the left side, and CF indicates whether the last bit shoved off was a 1 or a 0. It is used for multiplying an unsigned number by powers of 2.

SHR

SHR (shift logical right) does the same thing as SHL but in the opposite direction. Bits are shifted right. 0s are placed in the high bit. Bits are shoved off the register (or memory data) on the right side and CF indicates whether the last bit shoved off was a 0 or a 1. It is used for dividing an unsigned number by powers of 2.

SAR

SAR (shift arithmetic right) shifts bits right. The high (sign) bit stays the same throughout the operation. Bits are shoved off the register (or memory data) on the right side. CF indicates whether the last bit shoved off was a 1 or a 0. It is used (with difficulty) for dividing a signed number by powers of 2.

ROR and ROL

ROR (rotate right) and ROL (rotate left) rotate the bits of a register (or memory data) right and left respectively. The bit which is shoved off one end is moved to the other end. CF indicates whether the last bit moved from one end to the other was a 1 or a 0.

RCR and RCL

RCR (rotate through carry right) and RCL (rotate through carry left) rotate the bits of a register (or of memory data) right and left respectively.
The bit which is shoved off the register (or data) is placed in CF and the old CF is placed on the other side of the register (or data).

INC

INC increments a register or a variable by 1.

     inc  ax
     inc  variable1

DEC

DEC decrements a register or a variable by 1.

     dec  ax
     dec  variable1


The following is fairly technical. It is only for those willing to wade their way through a turgid explanation. If you don't understand it, forget it.

CODE FOR SHR

If you are shifting an UNSIGNED number right by 'X' bits, it is the same as dividing by (2 ** X): 1 bit = (2**1 = 2), 2 bits = (2**2 = 4), 7 bits = (2**7 = 128). This is the same as dividing by a number which is all 0s except the Xth bit which is 1 (for 0 we have 0000 0001, for 1 we have 0000 0010, for 3 we have 0000 1000, for 7 we have 1000 0000). The remainder mask will be this number minus 1 (for 0 we have 0000 0000, for 1 we have 0000 0001, for 3 we have 0000 0111, for 7 we have 0111 1111).

CODE FOR SAR

The order of numbers is important for SAR. If you start with 0 and add 1 each time, the actual sequence of signed numbers that you get (from the bottom up) is:

          -1
          -2
           .
           .
      -32767
      -32768
      +32767
      +32766
           .
           .
           3
           2
           1
           0

The positive numbers are increasing in absolute value while the negative numbers are decreasing in absolute value. If you divide by shifting and there is no remainder, then the quotient is exact. If there is a remainder, the quotient will truncate towards 0 IN THE ABOVE DIAGRAM. This means that positive numbers will truncate down, while the negative numbers will truncate towards -32768, and will be one too negative.

If the number was positive, the remainder will be positive and will be exactly the same as for SHR. If the number was negative, then things are more complicated. We'll take division by 32 as an example. If we divide by 32 (0010 0000) the remainder mask will be 31 (0001 1111).
If the number is negative, then what we get when we AND the mask:

     and  ax, 00011111b

is not the remainder but (remainder + 32). In order to get the actual negative remainder, we need to subtract 32. This gives us (remainder + 32 - 32).

     remainder mask = divisor - 1
     negative remainder correction = NEG divisor
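Putting the two rules together, a signed division by 32 might look like the following. This is only a sketch, assuming the fragment sits between the START CODE and END CODE lines of template.asm. The sample dividend -57 and the label name 'done' are assumptions made for illustration. The quotient ends up in AX and the remainder in BX.

```asm
; Sketch only: signed divide of AX by 32 with SAR, quotient and
; remainder both corrected for a negative dividend.
     mov  ax, -57              ; sample dividend (an assumption) = FFC7h
     mov  bx, ax               ; copy in bx
     and  bx, 001Fh            ; mask = divisor - 1 = 31, bx = 7
     mov  cl, 5                ; 2**5 = 32
     sar  ax, cl               ; ax = -2, one too negative
     jns  done                 ; positive: nothing to correct
     and  bx, bx               ; negative: is the remainder zero?
     jz   done
     inc  ax                   ; quotient  = -2 + 1  = -1
     sub  bx, 32               ; remainder = 7 - 32  = -25
done:
     call show_regs_and_wait   ; AX = -1 (FFFFh), BX = -25 (FFE7h)
```

As a check, (-1 * 32) + (-25) = -57, which is the same quotient and remainder that CWD followed by IDIV would give.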